

Search for: All records

Creators/Authors contains: "Bader, David"


  1. The suffix array is a crucial data structure for efficient string analysis. Over the course of twenty-six years, sequential suffix array construction algorithms have achieved O(n) time complexity and in-place sorting. In this paper, we present the Tunnel algorithm, the first large-scale parallel suffix array construction algorithm with a time complexity of O(n/p) based on the parallel random access machine (PRAM) model. The Tunnel algorithm is built on three key ideas: (1) dividing the problem of size O(n) into p sub-problems of reduced size O(n/p) by replacing long suffixes with shorter prefixes of size at most a constant D; (2) introducing a Tunnel mechanism to efficiently induce the order of a set of suffixes with long common prefixes; and (3) developing a strategy to transform a partially ordered suffix set into a total order relation by iteratively applying the Tunnel inducing method. We provide a detailed description of the algorithm, along with a thorough analysis of its time and space complexity, to demonstrate its correctness and efficiency. The proposed Tunnel algorithm exhibits scalable performance, making it suitable for large string analytics on large-scale parallel systems. 
    Free, publicly-accessible full text available December 1, 2024
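    For readers unfamiliar with the data structure itself, the following sketch builds a suffix array with a straightforward sort of all suffixes in Python. It is only a sequential baseline that shows what the Tunnel algorithm computes; it is not the Tunnel algorithm, and its naive O(n^2 log n) cost is far from the paper's parallel O(n/p) bound.

        def suffix_array(s: str) -> list[int]:
            """Return the suffix array of s: the starting positions of all
            suffixes of s, listed in lexicographic order of those suffixes.

            Naive construction used only to illustrate the output; the Tunnel
            algorithm described above produces the same array in parallel.
            """
            return sorted(range(len(s)), key=lambda i: s[i:])


        if __name__ == "__main__":
            text = "banana"
            sa = suffix_array(text)
            print(sa)                      # [5, 3, 1, 0, 4, 2]
            print([text[i:] for i in sa])  # ['a', 'ana', 'anana', 'banana', 'na', 'nana']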
  2. Finding connected components in a graph is a fundamental problem in graph analysis. In this work, we present a novel minimum-mapping based Contour algorithm to efficiently solve the connectivity problem. We prove that the Contour algorithm with two or higher order operators can identify all connected components of an undirected graph within O(log d_max) iterations, with each iteration involving O(m) work, where d_max represents the largest diameter among all components in the given graph, and m is the total number of edges in the graph. Importantly, each iteration is highly parallelizable, making use of the efficient minimum-mapping operator applied to all edges. To further enhance its practical performance, we optimize the Contour algorithm through asynchronous updates, early convergence checking, eliminating atomic operations, and choosing more efficient mapping operators. Our implementation of the Contour algorithm has been integrated into the open-source framework Arachne. Arachne extends Arkouda for large-scale interactive graph analytics, providing a Python API powered by the high-productivity parallel language Chapel. Experimental results on both real-world and synthetic graphs demonstrate the superior performance of our proposed Contour algorithm compared to the state-of-the-art large-scale parallel algorithm FastSV and the fastest shared-memory algorithm ConnectIt. On average, Contour achieves a speedup of 7.3x and 1.4x compared to FastSV and ConnectIt, respectively. All code for the Contour algorithm and the Arachne framework is publicly available on GitHub (https://github.com/Bears-R-Us/arkouda-njit), ensuring transparency and reproducibility of our work. 
    Free, publicly-accessible full text available December 18, 2024
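    The record above does not define the minimum-mapping operator precisely, so the sketch below shows a generic minimum-label propagation over edges, one common way to realize such an operator, iterating until no label changes. It is a sequential Python illustration of the idea only, not the optimized Chapel implementation in Arachne; the small edge list is hypothetical.

        def connected_components(n: int, edges: list[tuple[int, int]]) -> list[int]:
            """Label each vertex of an undirected graph with the smallest vertex id
            reachable from it, by repeatedly mapping both endpoints of every edge
            to the minimum of their current labels.

            Generic minimum-label propagation shown only to convey the flavor of a
            minimum-mapping operator; the Contour algorithm adds higher-order
            operators, early convergence checks, and other optimizations that are
            not reproduced here.
            """
            label = list(range(n))      # every vertex starts in its own component
            changed = True
            while changed:              # one "iteration" sweeps all edges
                changed = False
                for u, v in edges:
                    low = min(label[u], label[v])
                    if label[u] != low or label[v] != low:
                        label[u] = label[v] = low
                        changed = True
            return label


        if __name__ == "__main__":
            # Two components: {0, 1, 2, 3} and {4, 5}
            edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
            print(connected_components(6, edges))  # [0, 0, 0, 0, 4, 4]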
  3. Listing and counting triangles in graphs is a key algorithmic kernel for network analyses including community detection, clustering coefficients, k-trusses, and triangle centrality. We design and implement a new serial algorithm for triangle counting that performs competitively with the fastest previous approaches on both real and synthetic graphs, such as those from the Graph500 Benchmark and the MIT/Amazon/IEEE Graph Challenge. The experimental results use the recently launched Intel Xeon Platinum 8480+ and CPU Max 9480 processors. 
    Free, publicly-accessible full text available September 25, 2024
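    The record does not spell out the new serial algorithm, so the sketch below is only a standard Python baseline for comparison: it counts each triangle once by intersecting the adjacency sets of each edge's endpoints. It is not the algorithm evaluated on the Xeon processors mentioned above.

        def triangle_count(adj: dict[int, set[int]]) -> int:
            """Count triangles in an undirected simple graph given as an adjacency map.

            Standard edge-iterator baseline: for every edge (u, v) with u < v, count
            common neighbors w > v so that each triangle u < v < w is counted once.
            """
            total = 0
            for u, neighbors in adj.items():
                for v in neighbors:
                    if u < v:
                        # common neighbors of u and v that close a triangle "above" v
                        total += sum(1 for w in (adj[u] & adj[v]) if w > v)
            return total


        if __name__ == "__main__":
            # A 4-clique on vertices 0..3 contains exactly 4 triangles.
            k4 = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
            print(triangle_count(k4))  # 4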
  4. Analyzing large-scale graphs poses challenges due to their increasing size and the demand for interactive and user-friendly analytics tools. These graphs arise from various domains, including cybersecurity, social sciences, health sciences, and network sciences, where networks can represent interactions between humans, neurons in the brain, or malicious flows in a network. Exploring these large graphs is crucial for revealing hidden structures and metrics that are not easily computable without parallel computing. Currently, Python users can leverage the open-source Arkouda framework to efficiently execute Pandas- and NumPy-related tasks on thousands of cores. To address large-scale graph analysis, Arachne, an extension to Arkouda, enables easy transformation of Arkouda dataframes into graphs. Property graphs present additional complexities, requiring efficient storage for extra information on vertices and edges, such as labels, relationships, and properties. This paper proposes and evaluates three distributable data structures for property graphs, implemented in Chapel, that are integrated into Arachne. Enriching Arachne with support for property graphs will empower data scientists to extend their analysis to new problem domains. 
    Free, publicly-accessible full text available September 25, 2024
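    The record does not describe the three Chapel data structures in detail, so the sketch below is only a hypothetical, minimal Python representation of a property graph, storing per-vertex labels and properties and per-edge relationships and properties, to make the storage requirements mentioned above concrete. The names PropertyGraph, add_vertex, and add_edge are illustrative and are not Arachne's API.

        from dataclasses import dataclass, field


        @dataclass
        class PropertyGraph:
            """Minimal in-memory property graph: vertices and edges carry labels /
            relationship types plus arbitrary key-value properties.

            Illustrative single-node structure, not one of the three distributable
            Chapel data structures integrated into Arachne.
            """
            vertex_labels: dict[int, set[str]] = field(default_factory=dict)
            vertex_props: dict[int, dict[str, object]] = field(default_factory=dict)
            edge_rels: dict[tuple[int, int], set[str]] = field(default_factory=dict)
            edge_props: dict[tuple[int, int], dict[str, object]] = field(default_factory=dict)

            def add_vertex(self, v: int, labels: set[str] | None = None, **props) -> None:
                self.vertex_labels.setdefault(v, set()).update(labels or ())
                self.vertex_props.setdefault(v, {}).update(props)

            def add_edge(self, u: int, v: int, rel: str, **props) -> None:
                self.edge_rels.setdefault((u, v), set()).add(rel)
                self.edge_props.setdefault((u, v), {}).update(props)


        if __name__ == "__main__":
            g = PropertyGraph()
            g.add_vertex(0, {"Person"}, name="Alice")
            g.add_vertex(1, {"Person"}, name="Bob")
            g.add_edge(0, 1, "KNOWS", since=2021)
            print(g.edge_rels[(0, 1)], g.edge_props[(0, 1)])  # {'KNOWS'} {'since': 2021}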
  5. One of the most critical problems in the field of string algorithms is the longest common subsequence problem (LCS). The problem is NP-hard for an arbitrary number of strings but can be solved in polynomial time for a fixed number of strings. In this paper, we select a typical parallel LCS algorithm and integrate it into our large-scale string analysis algorithm library to support different types of large string analysis. Specifically, we take advantage of the high-level parallel language, Chapel, to integrate Lu and Liu’s parallel LCS algorithm into Arkouda, an open-source framework. Through Arkouda, data scientists can easily handle large string analytics on the back-end high-performance computing resources from the front-end Python interface. The Chapel-enabled parallel LCS algorithm can identify the longest common subsequences of two strings, and experimental results are given to show how the number of parallel resources and the length of input strings can affect the algorithm’s performance. 
    Free, publicly-accessible full text available September 25, 2024
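    As a point of reference for the parallel approach above, the sketch below is the textbook dynamic-programming LCS for two strings in Python. It is sequential, runs in O(|a|*|b|) time, and is not Lu and Liu's parallel algorithm or the Chapel/Arkouda implementation described in the record.

        def lcs(a: str, b: str) -> str:
            """Return one longest common subsequence of a and b via the classic
            dynamic program: L[i][j] = length of an LCS of a[:i] and b[:j]."""
            n, m = len(a), len(b)
            L = [[0] * (m + 1) for _ in range(n + 1)]
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    if a[i - 1] == b[j - 1]:
                        L[i][j] = L[i - 1][j - 1] + 1
                    else:
                        L[i][j] = max(L[i - 1][j], L[i][j - 1])

            # Trace back through the table to recover one optimal subsequence.
            out, i, j = [], n, m
            while i > 0 and j > 0:
                if a[i - 1] == b[j - 1]:
                    out.append(a[i - 1])
                    i, j = i - 1, j - 1
                elif L[i - 1][j] >= L[i][j - 1]:
                    i -= 1
                else:
                    j -= 1
            return "".join(reversed(out))


        if __name__ == "__main__":
            result = lcs("ABCBDAB", "BDCABA")
            print(result, len(result))  # an LCS of length 4, e.g. "BCBA"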
  6. Counting and finding triangles in graphs is often used in real-world analytics to characterize cohesiveness and identify communities in graphs. In this paper, we propose the novel concept of a cover-edge set that can be used to find triangles more efficiently. We use a breadth-first search (BFS) to quickly generate a compact cover-edge set. Novel sequential and parallel triangle counting algorithms are presented that employ cover-edge sets. The sequential algorithm avoids unnecessary triangle-checking operations, and the parallel algorithm is communication-efficient. The parallel algorithm can asymptotically reduce communication on massive graphs, such as graphs from real social networks and synthetic graphs from the Graph500 Benchmark. Based on our estimates for massive-scale Graph500 graphs, the new parallel algorithm can reduce the communication on a scale-36 graph by 1156x and on a scale-42 graph by 2368x. 
    Free, publicly-accessible full text available September 25, 2024
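    The record states that a BFS generates the compact cover-edge set but does not give the construction, so the sketch below shows one construction that is consistent with a BFS-based cover, under the assumption that edges whose endpoints land on the same BFS level form the cover: since BFS levels of adjacent vertices differ by at most one, a triangle's three vertices span at most two consecutive levels, so at least one of its edges is "horizontal" and falls in the cover. The exact definition used in the paper may differ.

        from collections import deque


        def cover_edges(n: int, adj: list[list[int]]) -> list[tuple[int, int]]:
            """Return a candidate cover-edge set built from BFS levels, assuming the
            cover consists of edges whose endpoints lie on the same BFS level."""
            level = [-1] * n
            cover = []
            for src in range(n):                  # handle every connected component
                if level[src] != -1:
                    continue
                level[src] = 0
                queue = deque([src])
                while queue:
                    u = queue.popleft()
                    for v in adj[u]:
                        if level[v] == -1:
                            level[v] = level[u] + 1
                            queue.append(v)
                        elif level[v] == level[u] and u < v:
                            cover.append((u, v))  # horizontal edge: record it once
            return cover


        if __name__ == "__main__":
            # Triangle 0-1-2 plus a pendant vertex 3: BFS from 0 places 1 and 2 on
            # level 1, so the single cover edge is (1, 2).
            adj = [[1, 2], [0, 2, 3], [0, 1], [1]]
            print(cover_edges(4, adj))  # [(1, 2)]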
  7. Mikyška, J. ; de Mulatier, C. ; Paszynski, M. ; Krzhizhanovskaya, V.V. ; Dongarra, J.J. ; Sloot, P.M. (Ed.)
    Triangle and square counts are widely used graph-analytic metrics that provide insights into the connectivity of a graph. While the literature has focused on algorithms for global counts in simple graphs, this paper presents parallel algorithms for global and per-node triangle and square counts in large multigraphs. The algorithms achieve linear improvements in computational complexity as the number of cores increases. The triangle count algorithm has the same complexity as the best-known algorithm in the literature. The square count algorithm has a lower execution time than previous methods. The proposed algorithms are evaluated on six real-world graphs and multigraphs, including protein-protein interaction graphs, knowledge graphs, and large web graphs. 
    Free, publicly-accessible full text available June 26, 2024
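    As a concrete reference for the square (4-cycle) count discussed above, the sketch below counts squares in a simple graph by combining common neighbors of vertex pairs: every square is determined twice by a pair of opposite corners, giving #squares = (1/2) * sum over pairs {u, w} of C(|N(u) ∩ N(w)|, 2). It is a sequential simple-graph baseline in Python, not the parallel multigraph algorithm from the paper.

        from itertools import combinations
        from math import comb


        def square_count(adj: dict[int, set[int]]) -> int:
            """Count 4-cycles (squares) in an undirected simple graph.

            Each square u-v-w-x-u is determined by its two pairs of opposite corners
            {u, w} and {v, x}, so summing C(|N(u) & N(w)|, 2) over all unordered
            vertex pairs counts every square exactly twice.
            """
            twice_the_count = 0
            for u, w in combinations(adj, 2):
                common = len(adj[u] & adj[w])   # candidates for the other two corners
                twice_the_count += comb(common, 2)
            return twice_the_count // 2


        if __name__ == "__main__":
            # A 4-cycle contains exactly one square; the complete graph K4 contains three.
            c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
            k4 = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
            print(square_count(c4), square_count(k4))  # 1 3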
  8. Finding connected components is a fundamental problem in graph analysis. We develop a novel minimum-mapping based Contour algorithm to solve the connectivity problem. The Contour algorithm can identify all connected components of an undirected graph within O(log d_max) iterations on m parallel processors, where d_max is the largest diameter among all components in a given graph and m is the total number of edges of the given graph. Furthermore, each iteration can easily be parallelized by employing the highly efficient minimum-mapping operator on all edges. To improve performance, the Contour algorithm is further optimized through asynchronous updates and simplified atomic operations. Our algorithm has been integrated into an open-source framework, Arachne, that extends Arkouda for large-scale interactive graph analytics with a Python API powered by the high-productivity parallel language Chapel. Experimental results on real-world and synthetic graphs show that the proposed Contour algorithm requires fewer iterations and achieves a 5.26x speedup on average compared with the state-of-the-art connected components method FastSV implemented in Chapel. All code is publicly available on GitHub (https://github.com/Bears-R-Us/arkouda-njit). 
    Free, publicly-accessible full text available June 1, 2024
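    The asynchronous-update optimization mentioned above can be illustrated with a small variation on the minimum-label sweep sketched after record 2: a synchronous sweep has every edge read labels from the previous iteration, while an asynchronous sweep updates labels in place so that later edges in the same pass already see the lowered values, which typically reduces the number of iterations. The Python contrast below is illustrative only, not the optimized Chapel code in Arachne.

        def sweep_synchronous(label: list[int], edges: list[tuple[int, int]]) -> list[int]:
            """One synchronous pass: every edge reads labels from the previous
            iteration and writes into a fresh copy."""
            new = label[:]
            for u, v in edges:
                low = min(label[u], label[v])
                new[u] = min(new[u], low)
                new[v] = min(new[v], low)
            return new


        def sweep_asynchronous(label: list[int], edges: list[tuple[int, int]]) -> list[int]:
            """One asynchronous pass: labels are updated in place, so edges later in
            the pass immediately see values lowered earlier in the same pass."""
            for u, v in edges:
                label[u] = label[v] = min(label[u], label[v])
            return label


        if __name__ == "__main__":
            # On the path 0-1-2-3, one asynchronous pass over the edges in this order
            # already labels every vertex 0; the synchronous pass needs more passes.
            edges = [(0, 1), (1, 2), (2, 3)]
            print(sweep_synchronous(list(range(4)), edges))   # [0, 0, 1, 2]
            print(sweep_asynchronous(list(range(4)), edges))  # [0, 0, 0, 0]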
  9. This paper presents an algorithm for detecting attributed high-degree node isomorphism. High-degree isomorphic nodes seldom happen by chance and often represent duplicated entities or data processing errors. By definition, isomorphic nodes are topologically indistinguishable and can be problematic in graph ML tasks. The algorithm employs a parallel, “degree-bounded” approach that fingerprints each node’s local properties through a hash, which constrains the search to nodes within hash-defined buckets, thus minimising the number of comparisons. This method scales on graphs with billions of nodes and edges. Finally, we provide isomorphic node oddities identified in real-world data. 
    Free, publicly-accessible full text available May 1, 2024
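    The degree-bounded fingerprinting idea can be sketched as follows: hash a cheap summary of each node's local properties (here just its degree and attributes, an assumption, since the record does not state exactly which local properties are hashed), group nodes by that hash, and run exact comparisons only within each bucket. The Python sketch below is sequential and illustrative, not the parallel implementation that scales to billions of nodes and edges.

        from collections import defaultdict


        def isomorphic_node_groups(adj: dict[int, set[int]],
                                   attrs: dict[int, tuple]) -> list[list[int]]:
            """Group nodes that carry identical attributes and identical neighbor sets
            (attributed isomorphic nodes).

            Bucketing by a fingerprint hash of (degree, attributes) limits the exact,
            more expensive comparisons to nodes that share a bucket.
            """
            buckets: dict[int, list[int]] = defaultdict(list)
            for v in adj:
                fingerprint = hash((len(adj[v]), attrs.get(v)))  # cheap local summary
                buckets[fingerprint].append(v)

            groups = []
            for candidates in buckets.values():
                seen: dict[tuple, list[int]] = defaultdict(list)
                for v in candidates:            # exact check only within a bucket
                    seen[(attrs.get(v), frozenset(adj[v]))].append(v)
                groups.extend(g for g in seen.values() if len(g) > 1)
            return groups


        if __name__ == "__main__":
            # Nodes 1 and 2 have the same attributes and the same neighbors {0, 3}:
            # topologically indistinguishable duplicates.
            adj = {0: {1, 2, 3}, 1: {0, 3}, 2: {0, 3}, 3: {0, 1, 2}}
            attrs = {0: ("hub",), 1: ("user",), 2: ("user",), 3: ("hub",)}
            print(isomorphic_node_groups(adj, attrs))  # [[1, 2]]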
  10. Given an undirected, weighted graph, the minimum spanning tree (MST) is a tree that connects all of the vertices of the graph with the minimum sum of edge weights. In real-world applications, network designers often seek to quickly find a replacement edge for each edge in the MST. For example, when a traffic accident closes a road in a transportation network, or a line goes down in a communication network, the replacement edge may reconnect the MST at lowest cost. In this paper, we consider the case of finding the lowest-cost replacement edge for each edge of the MST. A previous algorithm by Tarjan takes O(m α(m, n)) time and space, where α(m, n) is the inverse Ackermann function. Given the MST and sorted non-tree edges, our algorithm is the first practical algorithm that runs in O(m + n) time and O(m + n) space to find all replacement edges. Additionally, since the most vital edge is the tree edge whose removal causes the highest cost, our algorithm finds it in linear time. 
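    The record gives only the new bound, not the procedure. One classical way to compute replacement edges, sketched below, scans the non-tree edges in increasing weight order and lets each non-tree edge (u, v, w) become the replacement for every not-yet-covered tree edge on the tree path between u and v, using a union-find structure to jump over tree edges that already have a replacement. This runs in near-linear rather than strictly O(m + n) time and is an illustration only, not the paper's linear-time algorithm.

        def replacement_edges(n: int,
                              tree_adj: dict[int, list[int]],
                              root: int,
                              nontree_sorted: list[tuple[int, int, int]]) -> dict:
            """For each MST edge, find the cheapest non-tree edge that reconnects the
            two sides when that tree edge is removed.

            Non-tree edges arrive sorted by weight; edge (u, v, w) covers every still
            uncovered tree edge on the tree path u..v, and a union-find structure
            skips already-covered tree edges so each is assigned a replacement once.
            """
            # Root the tree: parent and depth via an iterative DFS.
            parent, depth = [-1] * n, [0] * n
            visited = [False] * n
            visited[root] = True
            stack = [root]
            while stack:
                x = stack.pop()
                for y in tree_adj[x]:
                    if not visited[y]:
                        visited[y] = True
                        parent[y], depth[y] = x, depth[x] + 1
                        stack.append(y)

            # dsu[v] = nearest ancestor of v (possibly v itself) whose edge to its
            # parent has not yet received a replacement.
            dsu = list(range(n))

            def find(v: int) -> int:
                while dsu[v] != v:
                    dsu[v] = dsu[dsu[v]]      # path halving
                    v = dsu[v]
                return v

            replacement = {}                   # tree edge (child, parent) -> (u, v, w)
            for u, v, w in nontree_sorted:     # cheapest non-tree edges first
                ru, rv = find(u), find(v)
                while ru != rv:
                    if depth[ru] < depth[rv]:  # always cover from the deeper side
                        ru, rv = rv, ru
                    replacement[(ru, parent[ru])] = (u, v, w)
                    dsu[ru] = parent[ru]       # this tree edge is now covered
                    ru = find(ru)
            return replacement


        if __name__ == "__main__":
            # MST is the path 0-1-2-3; the only non-tree edge (3, 0, 5) closes the
            # cycle and is the replacement for every tree edge.
            tree_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
            print(replacement_edges(4, tree_adj, 0, [(3, 0, 5)]))
            # {(3, 2): (3, 0, 5), (2, 1): (3, 0, 5), (1, 0): (3, 0, 5)}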